{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![alt text](https://github.com/callysto/callysto-sample-notebooks/blob/master/notebooks/images/Callysto_Notebook-Banner_Top_06.06.18.jpg?raw=true)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Modeling and predicting carbon dioxide level in atmosphere\n",
    "\n",
    "from Hongmei Zhu, York University for the Callysto Teacher Workshop on Computating and Modeling (Feb. 2019). This content was modified from notebooks created by Michael Lamoureux, University of Calgary and India Heisz, Callysto. \n",
    "\n",
    "Reference: Stewart Calculus: Early transcendentals, Section 1.2 \"Mathematial models: a catalog of essential functions.\"\n",
    "\n",
    "## Data: average carbon dioxide level in atmosphere from 1980 to 2012\n",
    "\n",
    "|Year | 1980|  1982| 1984 | 1986 | 1988 |1990 | 1992 | 1994 |1996  | 1998 | 2000| 2002 |  2004 |  2006 |  2008 |  2010 | 2012 |\n",
    "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|--|              \n",
    "| CO_2 Level (in ppm)| 338.7 | 341.2 | 344.4 | 347.2 | 351.5| 354.2 | 356.3 | 358.6 | 362.4 | 366.5 | 369.4 | 373.2 | 377.5 | 381.9 | 385.6 | 389.9 | 393.8 |\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Find a model for the carbon dioxide level \n",
    "\n",
    "First let us visulize the data. We make a scatter plot where t represents time (in years) and C represents the average CO2 lever (in parts per million, ppm)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Basic math functions\n",
    "from numpy import *\n",
    "import operator\n",
    "import scipy\n",
    "from scipy.optimize import curve_fit\n",
    "\n",
    "# Jupyter widgets give us the interactive components\n",
    "from ipywidgets import interact, FloatSlider\n",
    "\n",
    "# For plotting\n",
    "from matplotlib.pyplot import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Scatter Plotting\n",
    "\n",
    "We can plot a set of data using the plot command. It is sometimes more useful to plot x and y values together. We can do this by assigning a list of values to the variables t (time in year, the x-axis) and C (CO2 level in ppm, the y-axis), as follows, and then plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# assigne values to the axies and convert them into array for numerical calculations\n",
    "t = array(range(1980, 2014, 2))\n",
    "\n",
    "C = array([338.7,341.2,344.4,347.2,351.5,354.2,356.3,358.6,362.4,366.5,369.4,\n",
    "       373.2,377.5,381.9,385.6,389.9,393.8])\n",
    "\n",
    "# making a scatter plot \n",
    "plot(t, C, 'ro')\n",
    "\n",
    "title(\"Average CO2 level in Atmosphere\")\n",
    "xlabel(\"Year\")\n",
    "ylabel(\"CO2 level in ppm\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Find a model for the data \n",
    "\n",
    "There are no fixed rules to find a model. We usually start from observating data, try out simple models and then seek a model that most closely matches the data.  Here, we seek a curve that fits the data in the way that it captures the general trend of the collected data. \n",
    "\n",
    "From observing the plotted data, we can see there is a linear upward trend. This suggests that a linea curve is a natural choice. We try first a straight line, something like\n",
    "\n",
    "$$C_{𝑓𝑖𝑡}(t) =mt+b,$$\n",
    " \n",
    "and try to find good values for the slope $m$  and the y-intercept $b$.\n",
    "\n",
    "Since our data starts at year $t_0=1980$ , it might be more useful to set up the equation as\n",
    "$$ C_{𝑓𝑖𝑡}(t) =m(t−1980)+b.$$\n",
    "\n",
    "Let's find a good values for $m$ and $b$ so that the line fits the data well. The y-intercept $b$ can be obtained from $$C_{𝑓𝑖𝑡}(1980) = 1.721875 (1980 - 1980) + b = 338.7$$\n",
    "Hence $$ b = 338.7 $$ \n",
    "\n",
    "For the slope $m$, we can make some guesses first. We can select the line passing through the first and last data points. The slope m of the line is \n",
    "$$ m = \\frac{393.8-338.7}{2012-1980}=\\frac{55.1}{32} = 1.721875 $$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t = array(range(1980, 2014, 2))\n",
    "\n",
    "m = 1.72\n",
    "b = 338.7\n",
    "Cfit = m*(t-1980) + b\n",
    "\n",
    "# overlay the moded data to the scatter plot of collected data\n",
    "plot(t, C, 'ro', t, Cfit, 'b-')\n",
    "title(\"Average CO2 level in Atmosphere\")\n",
    "xlabel(\"Year\")\n",
    "ylabel(\"CO2 level in ppm\");\n",
    "legend(('Real', 'Modeled'),loc='upper left') "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Trying different slope value m to the data\n",
    "\n",
    "Our first model is not bad. Can we do better? Now let us try with different slope value $m$ by moving the slider below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def update(m=0):\n",
    "    b = 338.7\n",
    "    t = array(range(1980, 2014, 2))\n",
    "    Cfit = m*(t-1980) + b\n",
    "    \n",
    "    # overlay the moded data to the scatter plot of collected data\n",
    "    plot(t, C, 'ro', t, Cfit, 'b-')\n",
    "    title(\"Average CO2 level in Atmosphere\")\n",
    "    xlabel(\"Year\")\n",
    "    ylabel(\"CO2 level in ppm\");\n",
    "    legend(('Real', 'Modeled'),loc='upper left')\n",
    "    \n",
    "interact(update,m=FloatSlider(min=-.020,max=3,step=.01,readout_format='.3f'));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## An optimal model found automatically by curve fitting function\n",
    "\n",
    "For course, as you might imagine, Python has a function curve_fit that will find the best values of m and b for you.\n",
    "\n",
    "We first define a function $f$ that does our $m (t-1980)+b$ calculation. Then we call the function `curve_fit` to get the best values for m and b. \n",
    "\n",
    "We then plot the results, and print out the best fit values for $m$ and $b$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(t,m):\n",
    "    return m*(t-1980) + 338.7\n",
    "\n",
    "best, covar = curve_fit(f,t,C)\n",
    "\n",
    "m = best[0]\n",
    "Cfit = m*(t-1980) + 338.7\n",
    "\n",
    "plot(t, C, 'ro', t, Cfit, 'b-')\n",
    "title(\"Average CO2 level in Atmosphere and Best Linear Model\")\n",
    "xlabel(\"Year\")\n",
    "ylabel(\"CO2 level in ppm\");\n",
    "legend(('Real', 'Best Linear Modele'),loc='upper left')\n",
    "    \n",
    "print(\"m is\", m)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "Now we obtained a good linear model $$ C_{fit} = 1.63 (t-1980) + 338.7$$ What we can draw from this model? \n",
    "\n",
    "1. Notice that the data was collected once in two years. We can estimate the $CO_2$ at any given year. We can ask what the $CO_2$ level in 2007 is roughly. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "m = best[0]\n",
    "C2007 = m*(2007-1980) + 338.7\n",
    "print(\"The $CO_2$ level in 2007 is estimated as\", C2007)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Predicting the $CO_2$ lever in the future, say in year 2050? "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "C2050 = m*(2050-1980) + 338.7\n",
    "print(\"The $CO2$ level in 2050 is estimated as\", C2050)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. From the model, we conclude that $CO_2$ level is rising linearly at a rate (m) of about 1.63 \n",
    "ppm per year. In one hundred year, if temperatures continue to rise at this rate, we will see \n",
    "$CO_2$ level rises by about 163 ppm. we can then study the impact of global warmming and \n",
    "make early prevention  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![alt text](https://github.com/callysto/callysto-sample-notebooks/blob/master/notebooks/images/Callysto_Notebook-Banners_Bottom_06.06.18.jpg?raw=true)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "nbTranslate": {
   "displayLangs": [
    "*"
   ],
   "hotkey": "alt-t",
   "langInMainMenu": true,
   "sourceLang": "en",
   "targetLang": "fr",
   "useGoogleTranslate": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
